Dealing with deletion errors in MT

Author

  • Jacob Devlin
Abstract

Content word deletion has been widely identified as a common error in statistical machine translation systems. We call a translation error a “content word deletion” when a content word appears in the reference translation of a sentence but is absent from the system output for that sentence. For our purposes, the term “content word” has no precise definition; it denotes any word that we judge to carry semantic meaning in the sentence. We hypothesize that a major source of content word deletion is the use of translation rules that contain unaligned content words on the source side. The following is a rule of this type:
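The rule referred to above is not reproduced in this excerpt. As a rough illustration of the two notions in the abstract, the following minimal Python sketch flags source-side content words that have no alignment link in a rule, and lists content words that appear in a reference but not in a system output. Everything in it is an assumption made for illustration: the function names, the placeholder tokens `f1`..`f3`, the small function-word list, and the representation of an alignment as (source index, target index) pairs do not come from the paper.

```python
from typing import List, Set, Tuple

# Hypothetical function-word list; the abstract leaves "content word" informal,
# so treating every word outside this list as a content word is an assumption.
FUNCTION_WORDS: Set[str] = {"the", "a", "an", "of", "to", "in", "is", "and"}


def unaligned_source_content_words(
    source: List[str],
    alignment: Set[Tuple[int, int]],
    content_words: Set[str],
) -> List[str]:
    """Return source-side content words of a rule that have no alignment link.

    `alignment` holds (source_index, target_index) pairs.
    """
    aligned_positions = {src_idx for src_idx, _ in alignment}
    return [
        word
        for idx, word in enumerate(source)
        if idx not in aligned_positions and word in content_words
    ]


def deleted_content_words(
    reference: List[str],
    output: List[str],
    function_words: Set[str] = FUNCTION_WORDS,
) -> List[str]:
    """Return content words present in the reference but absent from the output."""
    output_vocab = {w.lower() for w in output}
    return [
        w
        for w in reference
        if w.lower() not in function_words and w.lower() not in output_vocab
    ]


# Toy rule with placeholder source tokens: f2 is a content word with no link.
toy_unaligned = unaligned_source_content_words(
    source=["f1", "f2", "f3"],
    alignment={(0, 0), (2, 1)},
    content_words={"f2", "f3"},
)

# Toy sentence pair: the output drops the verb found in the reference.
toy_deleted = deleted_content_words(
    reference="the minister visited paris".split(),
    output="the minister paris".split(),
)
```

Under this reading, a rule for which the first function returns a non-empty list is the kind the hypothesis points to: applying it can consume a source content word without producing any target word for it.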

Similar resources

Investigating the frequency of mitochondrial DNA mutations in type 2 diabetes

Background: The mitochondrion is an intracellular organelle with its own DNA. Several diseases caused by mtDNA mutations have been reported to date. The A3243G mutation and the 5 kb deletion are two such mutations related to type II diabetes. The aim of this study was to evaluate the frequency of the A3243G mutation and the 5 kb mtDNA deletion in type II diabetic patients. Methods: The DNA extracted from...

Role of Mitochondria in Ataxia-Telangiectasia: Investigation of Mitochondrial Deletions and Haplogroups

Ataxia-Telangiectasia (AT) is a rare human neurodegenerative autosomal recessive multisystem disease characterized by a wide range of features, including progressive cerebellar ataxia with onset during infancy, oculocutaneous telangiectasia, susceptibility to neoplasia, oculomotor disturbances, chromosomal instability, and growth and developmental abnormalities. Mitochondrial DNA (mtDN...

A de novo Deletion of Chromosome 18p With Persistent Limb Tremor and Difficulty Speaking: A Case Report

The common causes of 18p deletion syndrome are spontaneous errors in chromosomal structure in the early stages of human embryonic development. In this study, a 29-year-old woman is presented with features of a chromosome 18 deletion. GTG-banding karyotyping revealed a deletion involving the short arm of chromosome 18. In comparison with the usual phenotyp...

Are Unaligned Words Important for Machine Translation?

In this paper, we deal with the problem of the large number of unaligned words in automatically learned word alignments for machine translation (MT). These unaligned words give rise to ambiguous phrase pairs extracted by a statistical phrase-based MT system. In translation, this phrase ambiguity causes deletion and insertion errors. We present hard and optional deletion approaches to remove...

Liu Estimates and Influence Analysis in Regression Models with Stochastic Linear Restrictions and AR(1) Errors

In linear regression models with an AR(1) error structure, when collinearity exists, stochastic linear restrictions or modifications of biased estimators (including Liu estimators) can be used to reduce the estimated variance of the regression coefficient estimates. In this paper, the combination of the biased Liu estimator and stochastic linear restrictions estimator is considered to overcom...

Publication date: 2008